Estimating Activity based on Visits to Points of Interest#

Estimating visits within or in the proximity to points of interest, such as residential, commercial and industrial zones can potentially shed light into the economic impacts brought by the earthquakes near Nurdağı, Türkiye. Similarly to Google Community Mobility Reports (now discontinued), the following working methodology seeks to capture the change in mobility (i.e., number of visits) within OpenStreetMap points of interest compared to a baseline (Jan-23).

Data#

In this section, we import from the data sources manipulated throughout the notebook. Note that data is a placeholder location where we recommended to place the interim data. When using non-public data, please carefully abide by the terms and conditions and data classification.

Hide code cell source
# https://papermill.readthedocs.io/en/latest/usage-parameterize.html
COUNTRY = "TR"
PANEL = "v2023.3.19"
TAG = "landuse"

Area of Interest#

In this study, the area of interest is defined as the affected 11 provinces in Türkiye and the area highlighted in northern Syria as shown below.

Make this Notebook Trusted to load map: File -> Trust Notebook

OpenStreetMap#

Using the Humanitarian OpenStreetMap’s Export Tool, the project team obtained a OSM dataset of points of interest within a clipping boundary defined by the area of interest between Türkiye and Syria.

POI = dask_geopandas.read_file(
    "../../data/external/export.hotosm.org/hotosm_worldbank_tur_points_of_interest_gpkg.zip!hotosm_worldbank_tur_points_of_interest.gpkg",
    npartitions=16,
)

To illustrate, we visualize below the points of interest tagged with landuse=industrial.

POI[POI[TAG].isin(["industrial"])].compute().explore(color="red")
Make this Notebook Trusted to load map: File -> Trust Notebook

Veraset Movement#

Veraset Movement provides a panel of human mobility data, based on data collection of GPS-enabled devices location.

ddf = dd.read_parquet(
    f"../../data/final/panels/{PANEL}",
    filters=[("country", "=", COUNTRY)],
)

Calculating date from the date and time the points were collected.

ddf["date"] = dd.to_datetime(ddf["datetime"].dt.date)

Methodology#

Similarly to Google Community Mobility Reports (now discontinued), the following working methodology seeks to capture the change in mobility (i.e., number of visits) within OpenStreetMap points of interest compared to a baseline (Jan-23). Note that the mobility data represents a subset of the total population in an area, specifically only users that turned on the Location Services device setting on their mobile device. This is not the total population density.

Using Dask-GeoPandas, we calculate the number of devices seen within the points of interest and aggregate by the landuse classication (e.g, residential), H3 index and date.

gddf = dask_geopandas.from_dask_dataframe(
    ddf,
    geometry=dask_geopandas.points_from_xy(ddf, "longitude", "latitude"),
).set_crs("EPSG:4326")

We execute a spatial join (without H3) to aggregate the number of devices for each spatial and temporal bin. In this case, we use H3 resolution 6 and a daily aggregation.

result = (
    gddf.sjoin(POI, predicate="within")
    .groupby([TAG, "hex_id", "date"])["uid"]
    .nunique()
    .to_frame(name="count")
    .reset_index()
    .compute()
)

Finally, the results joined into administrative divisions.

result = result.merge(
    AOI[["hex_id", "ADM0_PCODE", "ADM1_PCODE", "ADM2_PCODE"]], on="hex_id"
)

Results#

In this section, we visualize the daily number of devices detected within each of the following OSM landuse classification.

FEATURES = [
    "residential",
    "commercial",
    "industrial",
    "education",
    "farmland",
    "construction",
]
result.sort_values(["date"])
landuse hex_id date count ADM0_PCODE ADM1_PCODE ADM2_PCODE
0 cemetery 862da8837ffffff 2023-01-01 1 TR TUR027 TUR027008
7910 residential 862dac077ffffff 2023-01-01 2 TR TUR080 TUR080001
19140 residential 862da170fffffff 2023-01-01 64 TR TUR031 TUR031014
7815 forest 862dac00fffffff 2023-01-01 3 TR TUR080 TUR080003
7814 residential 862dac00fffffff 2023-01-01 1 TR TUR080 TUR080003
... ... ... ... ... ... ... ...
9632 residential 862dae807ffffff 2023-03-20 39 TR TUR031 TUR031013
20447 industrial 862da1247ffffff 2023-03-20 1 TR TUR031 TUR031009
13296 cemetery 862c3056fffffff 2023-03-20 14 TR TUR021 TUR021015
1188 residential 862d127b7ffffff 2023-03-20 5 TR TUR001 TUR001015
23051 residential 862da8907ffffff 2023-03-20 1 TR TUR027 TUR027008

32242 rows × 7 columns

Hide code cell source
def plot_visitation(data, title="POI Visitation"):
    """Plot number of visits to OSM points of interest"""

    p = figure(
        title=title,
        width=700,
        height=600,
        x_axis_label="Date",
        x_axis_type="datetime",
        y_axis_label="Number of devices",
        y_axis_type="log",
        y_range=(1, 10**5),
        tools="pan,wheel_zoom,box_zoom,reset,save,box_select",
    )
    p.add_layout(Legend(), "right")
    p.add_layout(
        Title(
            text=f"Number of devices detected within OSM '{TAG}' for each 3-day time window",
            text_font_size="12pt",
            text_font_style="italic",
        ),
        "above",
    )
    p.add_layout(
        Title(
            text=f"Source: Veraset Movement. Creation date: {datetime.today().strftime('%d %B %Y')}. Feedback: datalab@worldbank.org.",
            text_font_size="10pt",
            text_font_style="italic",
        ),
        "below",
    )
    
    # plot spans
    p.renderers.extend(
        [
            Span(
                location=datetime(2023, 2, 6),
                dimension="height",
                line_color="grey",
                line_width=2,
                line_dash=(4, 4),
            ),
            Span(
                location=datetime(2023, 2, 20),
                dimension="height",
                line_color="grey",
                line_width=2,
                line_dash=(4, 4),
            ),
        ]
    )
    
    # plot lines
    renderers = []
    for column, color in zip(FEATURES, COLORS):
        try:
            r = p.line(
                data.index[:-1],
                data[column][:-1],
                legend_label=column,
                line_color=color,
                line_width=2,
            )
            renderers.append(r)
        except:
            pass
        
    p.add_tools(
        HoverTool(
            tooltips="date: @x{%F}, devices: @y",
            formatters={"@x": "datetime"},
        )
    )

    p.legend.location = "bottom_left"
    p.legend.click_policy = "hide"
    p.title.text_font_size = "16pt"
    p.sizing_mode = "scale_both"
    return p

By total area#

By aggregating the visitation tally, we show a smoothed representation of how many users were detected within the total area for each 3-day time period.

data = (
    result.pivot_table("count", index=["date"], columns=[TAG], aggfunc="sum")
    .groupby(pd.Grouper(freq="3D"))
    .sum()
)
Hide code cell source
output_notebook()
show(plot_visitation(data))
Loading BokehJS ...

By first-level administrative divisions#

By aggregating the visitation tally, we show a smoothed representation of how many users were detected within each first-level administrative division and for each 3-day time period.

Hide code cell source
tabs = []

# iterate over first-level administrative divisions
for pcode in sorted(result["ADM1_PCODE"].unique()):
    data = (
        result[result["ADM1_PCODE"] == pcode]
        .pivot_table("count", index=["date"], columns=[TAG], aggfunc="sum")
        .groupby(pd.Grouper(freq="3D"))
        .sum()
    )
    tabs.append(
        TabPanel(
            child=plot_visitation(data, title=f"Points of Interest Visits in {pcode}"),
            title=pcode,
        )
    )

# plot panel
tabs = Tabs(tabs=tabs, sizing_mode="scale_both")
show(tabs)

Limitations#

Warning

Note that the mobility data represents a subset of the total population in an area, specifically only users that turned on the Location Services device setting on their mobile device. This is not the total population density.